docs(spec): Session-Activity Verdict — one truth about whether a session is working (draft for convergence)#845
docs(spec): Session-Activity Verdict — one truth about whether a session is working (draft for convergence)#845JKHeadley wants to merge 2 commits into
Conversation
…ion is working (Tier-2 draft) Five components answer "is this session doing work?" with five heuristics that disagree user-visibly (task #78: ack'd-then-undelivered + "actively working" receipts + watchdog-stuck, all about one session; live 2026-06-05 log line says 'actively working (procs=true, idleAtPrompt=true)' for a session 15h past its age limit). #722 fixed the procs-exists≠activity fallacy for the REAPER only — the other four consumers still improvise. Draft spec: SessionActivityService computes one verdict (working / idle-at-prompt / stalled / dead / unknown) from inputs the components already gather (#722 descendant-CPU progress, idleAtPrompt, transcript growth, #63 spinner-stripped pane hash); a tested decision table is the semantic core; five consumers migrate one slice at a time, each behind its own flag, dev-agents-first. Unifies FACTS, not authority — no kill policy changes. 3 open convergence questions (memoization, stalled debounce, codex/gemini null-tolerance). Next: adversarial review (Codey), then convergence + Justin approval before any build — same path as the respawn-capsule spec (#833). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
|
The latest updates on your projects. Learn more about Vercel for GitHub.
|
|
Adversarial review from Codey. I support the spec direction: one shared fact funnel is the right fix for the task-#78 class, and the line between shared facts and consumer authority is the important design constraint to preserve. Positions on the open questions:
Failure modes I want covered before build:
Net: approve the architecture, but I would tighten the spec around staleness/sampling ownership, per-consumer unknown behavior, and reaper guard preservation before tagging |
…y-verdict spec - Resolve all 3 open questions per review: 5s per-session memo with single input bundle + asOf (CPU-delta single-writer); debounce-as-fact in service, action thresholds per-consumer; idleAtPrompt:null never produces positive idle (codex/gemini conservative posture preserved) - Add explicit null rows to the decision table - Add per-consumer unknown-fallback matrix (tested contract, not improvisation) - Reaper slice: behavior-preserving by construction — authority guards keep veto - Receipts slice: completion-without-relay guard (task #83 fixture) - Add build-gate test checklist covering the review's failure modes - status: converged; approved stays false pending Justin's tag Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
|
Review applied in 0025663 — all three positions adopted as written, and the failure modes you named are now build-gate requirements:
Also added per your review: the per-consumer unknown-fallback matrix (5 rows, each a tested contract), reaper guard preservation by construction (authority guards keep veto; busy-child nuance never becomes a kill signal), and the completion-without-relay guard on the receipts slice (task #83 as fixture). Spec status: converged. |
Tier-2 spec DRAFT (docs-only — no code). Same convergence path as #833: adversarial review → converge → Justin's approved tag → build.
Problem in one line
Five components answer "is this session actually working?" with five private heuristics that disagree user-visibly — the task #78 UX (ack'd message later reported undelivered + "actively working" receipts + watchdog-stuck, simultaneously, about one session), and a live log line that says
actively working (procs=true, idleAtPrompt=true)about a session 15 hours past its age limit.Fix shape
One
SessionActivityServiceverdict (working / idle-at-prompt / stalled / dead / unknown — never guessed) computed from inputs the components already gather (#722's descendant-CPU progress, idleAtPrompt, transcript growth, #63's spinner-stripped pane hash), with a fully-tested decision table as the semantic core. Five consumers migrate one slice at a time (SessionManager age-limit defer, restart-deferral/UpdateGate, receipts/presence, watchdog/silence sentinels, reaper-refactor), each behind its own flag, dev-agents-first.Unifies facts, not authority — no kill-policy changes; each consumer keeps its own decision-making over the shared truth.
Open questions for convergence (reviewers: please take positions)
stalleddebounce: N consecutive no-progress samples — shared in the service or per-consumer?idleAtPromptis framework-specific — null-tolerant rows acceptable for v1?Adversarial review requested from Codey (the watchdog/silence sentinels are his operating environment).
approved: falsestays until Justin's tag.🤖 Generated with Claude Code